Received: from yvax2.byu.edu by maine.et.byu.edu; Sat, 2 Oct 93 00:06:16 -0600
Return-Path: <kane@sonata.cc.purdue.edu>
Received: from DIRECTORY-DAEMON by yvax.byu.edu (PMDF V4.2-13 #4169) id
<01H3M8161VKG94IGWU@yvax.byu.edu>; Sat, 2 Oct 1993 00:04:16 MDT
Received: from alaska.et.byu.edu by yvax.byu.edu (PMDF V4.2-13 #4169) id
<01H3M812E9CG936GS5@yvax.byu.edu>; Sat, 2 Oct 1993 00:04:11 MDT
Received: from yvax.byu.edu by alaska.et.byu.edu; Sat, 2 Oct 93 00:05:54 -0600
Received: from DIRECTORY-DAEMON by yvax.byu.edu (PMDF V4.2-13 #4169) id
<01H3M80PSEM8936CGQ@yvax.byu.edu>; Sat, 2 Oct 1993 00:03:54 MDT
Received: from sonata.cc.purdue.edu by yvax.byu.edu (PMDF V4.2-13 #4169) id
<01H3M80KN9VK9367AS@yvax.byu.edu>; Sat, 2 Oct 1993 00:03:47 MDT
Received: from cantata.cc.purdue.edu by sonata.cc.purdue.edu (5.61/Purdue_CC)
id AA01813; Sat, 2 Oct 93 01:03:41 -0500
Received: by cantata.cc.purdue.edu (NX5.67d/NX3.0X) id AA02262; Sat,
2 Oct 93 01:03:39 -0500
Received: by NeXT.Mailer (1.95)
Received: by NeXT Mailer (1.95)
Date: Sat, 02 Oct 1993 01:03:39 -0500
From: kane@sonata.cc.purdue.edu
Subject: Beta testers wanted: fast string-search algorithms
To: misckit@byu.edu
Message-Id: <9310020603.AA01813@sonata.cc.purdue.edu>
Content-Transfer-Encoding: 7BIT
Status: RO

I am putting the final touches on an implementation of a very
fast string searching algorithm (a variation on the Tuned
Boyer-Moore algorithm due to Hume & Sunday) and would like
some people to pound on it--implement test programs around it,
use it within real programs--whatever. The functions will
be going into the MiscFindPanel class, and perhaps into Don
Yacktman's MiscString class.

These functions (3 of them in the "module") implement literal
pattern matching with options for case sensitivity and
forward/reverse searching, as you'd expect, and are _very_
fast (about as fast as Boyer-Moore with a fast skip loop)
even with all the extra functionality that Boyer & Moore
(or Hume & Sunday) never considered. Simple estimates done
on text sizes up to 100MB with a small (3 byte) pattern
indicate that it scans between 2.0 and 8.8 MB per second (the
larger the text size, the higher the apparent scan rate),
and larger patterns result in even higher performance
(generally). (This was on an '040 NeXTCube.)
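
For readers who want something concrete to test against, here is a minimal
sketch of literal matching with case-sensitivity and forward/reverse
options. It is NOT the code being offered for beta testing and it is not
the Tuned Boyer-Moore of Hume & Sunday; it uses the simpler Horspool
variant of Boyer-Moore, with no tuned skip loop. The names sketch_search,
SKETCH_IGNORE_CASE and SKETCH_REVERSE are hypothetical, chosen only for
illustration.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical option flags; these names are illustrative only. */
enum { SKETCH_IGNORE_CASE = 1, SKETCH_REVERSE = 2 };

/* Fold a byte to lower case only when case-insensitive matching is on. */
static unsigned char fold(unsigned char c, int ignore_case)
{
    return ignore_case ? (unsigned char)tolower(c) : c;
}

/*
 * Horspool-style literal search (an assumed stand-in, not the original).
 * Returns the offset of the first match, or of the last match when
 * SKETCH_REVERSE is set, or -1 if there is no match or the input is bad.
 */
long sketch_search(const unsigned char *text, size_t n,
                   const unsigned char *pat, size_t m, int flags)
{
    size_t skip[UCHAR_MAX + 1];
    int ic = (flags & SKETCH_IGNORE_CASE) != 0;
    size_t i, j, s;

    if (text == NULL || pat == NULL || m == 0 || m > n)
        return -1;

    /* Default shift: a byte absent from the pattern moves a full length. */
    for (i = 0; i <= UCHAR_MAX; i++)
        skip[i] = m;

    if (!(flags & SKETCH_REVERSE)) {
        /* Shift = distance from a byte's last occurrence in
           pat[0..m-2] to the end of the pattern. */
        for (i = 0; i + 1 < m; i++)
            skip[fold(pat[i], ic)] = m - 1 - i;

        /* Slide the window rightward, keyed on its last byte. */
        for (i = 0; i + m <= n; i += skip[fold(text[i + m - 1], ic)]) {
            for (j = m; j > 0 &&
                 fold(text[i + j - 1], ic) == fold(pat[j - 1], ic); j--)
                ;
            if (j == 0)
                return (long)i;
        }
    } else {
        /* Mirror image: shift = smallest index >= 1 at which the
           byte occurs in the pattern. */
        for (i = m - 1; i > 0; i--)
            skip[fold(pat[i], ic)] = i;

        /* Slide the window leftward, keyed on its first byte. */
        i = n - m;
        for (;;) {
            for (j = 0; j < m &&
                 fold(text[i + j], ic) == fold(pat[j], ic); j++)
                ;
            if (j == m)
                return (long)i;
            s = skip[fold(text[i], ic)];
            if (s > i)
                break;          /* next shift would run off the front */
            i -= s;
        }
    }
    return -1;
}
```

A call such as sketch_search(text, n, pat, m, SKETCH_IGNORE_CASE |
SKETCH_REVERSE) looks for the last case-insensitive occurrence. Lengths are
passed explicitly, so the text may contain NUL bytes. Folding with tolower
on every comparison keeps the sketch short; a tuned version would more
likely pre-fold the pattern and use a lookup table.
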
I want to know about any bugs or anything that manages to
cause these functions to crash (think you can crash them with
random pointers? Ha! You'll have to try harder than that.)
I've done testing of my own, and some analyses on the
algorithms, and have corrected all the problems I have found.
Also, I've done all my testing on NeXT boxes; some testing
on an Intel box or two would be very helpful.

Email me at kane@cs.purdue.edu if you are interested in
trying to break these routines. I'd prefer to non-NextMail
them to you (it's 7.5KB total), but other options will be
considered.

Thanks,
Christopher Kane
kane@cs.purdue.edu